Skip to content

fix(browser): exclude DNS errors from consecutive failure limit#1156

Merged
vringar merged 5 commits into
masterfrom
fix/dns-error-failure-limit-v2
May 7, 2026
Merged

fix(browser): exclude DNS errors from consecutive failure limit#1156
vringar merged 5 commits into
masterfrom
fix/dns-error-failure-limit-v2

Conversation

@vringar
Copy link
Copy Markdown
Contributor

@vringar vringar commented Apr 4, 2026

Summary

  • DNS resolution errors (dnsNotFound / NXDOMAIN) no longer count toward the consecutive failure limit that stops crawling. The browser still restarts after each DNS error — appropriately cautious for the measurement use case.
  • Extract is_dns_error() as a named, importable function for testability
  • Hold task_manager.threadlock across both the failure_count increment and the threshold compare so the update+compare is atomic across browser threads
  • Add integration test that navigates to failure_limit * 3 nonexistent domains, verifying the crawl continues past failure_limit=5
  • Add unit tests that import and verify the real predicate (not a local copy)

Only dnsNotFound is excluded — DNS timeouts/SERVFAIL intentionally still count as they may indicate real network issues.

Supersedes #1139.
Closes #1116

Test plan

  • test_dns_error_does_not_count_against_failure_limit passes
  • test_is_dns_error_predicate tests the real is_dns_error function
  • pre-commit run --all-files passes

Copilot AI review requested due to automatic review settings April 4, 2026 21:11
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Excludes Firefox dnsNotFound (NXDOMAIN) neterrors from contributing to the consecutive failure limit that stops crawling, and adds tests to ensure crawls continue through large numbers of nonexistent domains.

Changes:

  • Added _is_dns_error() predicate and used it to skip incrementing failure_count on dnsNotFound.
  • Added an integration test that navigates to 110 .invalid domains to verify no failure-limit abort occurs.
  • Added a unit test that imports and validates the real _is_dns_error() predicate.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

File Description
openwpm/browser_manager.py Adds _is_dns_error() and uses it to exclude dnsNotFound from the failure counter.
test/test_webdriver_utils.py Adds integration + unit tests validating the new DNS-exclusion behavior and predicate correctness.
.gitignore Adds ignore rules for Crosslink/Claude-generated local state.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread openwpm/browser_manager.py Outdated
Comment thread test/test_webdriver_utils.py Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 62.19%. Comparing base (6b1a26c) to head (9228048).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1156      +/-   ##
==========================================
+ Coverage   62.16%   62.19%   +0.03%     
==========================================
  Files          40       40              
  Lines        3898     3902       +4     
==========================================
+ Hits         2423     2427       +4     
  Misses       1475     1475              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@vringar vringar force-pushed the fix/dns-error-failure-limit-v2 branch from e6d86c9 to 72ea044 Compare April 7, 2026 23:37
@vringar vringar force-pushed the fix/dns-error-failure-limit-v2 branch from cc064fd to 1ab9595 Compare April 19, 2026 21:16
vringar added a commit that referenced this pull request Apr 19, 2026
@vringar vringar force-pushed the fix/dns-error-failure-limit-v2 branch from 1ab9595 to a67fac3 Compare April 19, 2026 21:29
@vringar vringar requested a review from Copilot May 7, 2026 10:23
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

Comment thread openwpm/browser_manager.py Outdated
Comment thread test/test_webdriver_utils.py Outdated
Comment thread test/test_webdriver_utils.py Outdated
Comment thread openwpm/browser_manager.py
vringar added 4 commits May 7, 2026 11:27
Extract the inline is_dns_error predicate from execute_command_sequence
into a module-level _is_dns_error() function so the test imports and
exercises the real code instead of a local copy. Remove unused
CommandExecutionError import from the test file.
@vringar vringar force-pushed the fix/dns-error-failure-limit-v2 branch from a67fac3 to 0698f44 Compare May 7, 2026 11:32
- Move failure_count threshold check inside threadlock so increment and
  compare are atomic across browser threads.
- Reduce test domain count from 110 to failure_limit*3 and wrap the
  navigation loop in try/finally for cleanup on failure.

Browser restart on DNS errors is intentional — appropriately cautious
for our use case.
@vringar vringar force-pushed the fix/dns-error-failure-limit-v2 branch from 0698f44 to 9228048 Compare May 7, 2026 11:37
@vringar vringar enabled auto-merge May 7, 2026 11:42
@vringar vringar added this pull request to the merge queue May 7, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 7, 2026
@vringar vringar added this pull request to the merge queue May 7, 2026
Merged via the queue into master with commit ed8efc7 May 7, 2026
14 checks passed
@vringar vringar deleted the fix/dns-error-failure-limit-v2 branch May 7, 2026 12:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Consecutive DNS errors crash a Tranco crawl

2 participants